Method and apparatus for processing database data in distributed database system

ABSTRACT

A method and apparatus for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the method comprising: creating a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica, storing different data replicas in different computing nodes; and creating an index for each of the data replicas according to its row key.

BACKGROUND ART

The present invention relates to distributed databases and, more particularly, to the processing of data in distributed databases.

Databases are widely applied in fields such as e-business, social networking, internet searching and the internet of things (IoT). Databases include relational databases and non-relational databases. In a relational database, a table is a formatted data structure. The field composition is the same for all tuples in a table. Although not all fields are needed for all tuples, the database will allocate all fields for each tuple. Such a structure may facilitate operations like linking one table with another.

A non-relational database stores information with tuples consisting of key-value pairs. Its structure is not fixed and different tuples may have different fields. Each tuple may have additional key-value pairs of its own as needed, and thus it is not limited by a fixed structure. Therefore, non-relational databases scale well, and because of this feature they have developed rapidly.

Both relational databases and non-relational databases may store multi-dimensional data. Examples of multi-dimensional data include measurement data of sensors, such as temperature values and wind speed values measured at different points of time. In querying a database, if a particular sensor and a time recorded by the particular sensor are to be queried simultaneously, the sensor's device-id and the time constitute two-dimensional data. If a particular sensor and a time and a temperature recorded by the particular sensor are to be queried simultaneously, the sensor's device-id, the time and the temperature constitute three-dimensional data.

In the prior art there exist techniques for processing multi-dimensional data, including techniques of indexing, storage and querying with respect to multi-dimensional data. However, when these techniques are applied to process multi-dimensional data, extra computing resources need to be consumed to achieve higher efficiency. With the rapid increase in the amount of database data processed by various applications, it is increasingly important to mitigate this conflict between efficiency and resource consumption.

SUMMARY OF THE INVENTION

In view of the prior art, one of the objectives of the present invention is to provide an improved method and apparatus for processing database data in distributed database systems.

In one aspect, there is disclosed a method for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the method comprising: creating a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica; storing different data replicas in different computing nodes; and creating an index for each of the data replicas according to its row key.

In another aspect, there is disclosed an apparatus for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the apparatus comprising: a data replica creation module configured to create a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica; a replica storage module configured to store different data replicas in different computing nodes; and an index creation module configured to create an index for each of the data replicas according to its row key.

BRIEF DESCRIPTION OF THE DRAWINGS

The above and other objects, features and advantages of the present disclosure will become more apparent from the more detailed description of exemplary embodiments of the present disclosure in conjunction with the accompanying drawings, wherein the same reference numerals typically represent the same members in the exemplary embodiments of the present disclosure.

FIG. 1 depicts a block diagram of an exemplary computing system 100 adapted to be used to implement embodiments of the present invention;

FIG. 2 illustratively depicts a distributed database system according to an embodiment of the present invention;

FIG. 3 illustratively depicts two examples of database data;

FIGS. 4A-4C illustratively depict data replicas in accordance with an embodiment of the present invention;

FIG. 5 illustratively depicts an index of a data replica in accordance with an embodiment of the present invention;

FIG. 6 schematically shows a flowchart of a method according to an embodiment of the present invention; and

FIG. 7 schematically depicts a block diagram of an apparatus according to an embodiment of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Preferred embodiments of the present disclosure will be described in greater detail below with reference to the accompanying drawings. Although the accompanying drawings show preferred embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and is not limited to the embodiments illustrated herein. On the contrary, these embodiments are provided to make the present disclosure more thorough and complete, such that the scope of the present disclosure can be completely conveyed to one of ordinary skill in the art.

FIG. 1 shows a block diagram of an exemplary computing system 100 which is applicable to implement the embodiments of the present invention. As shown in FIG. 1, the computing system 100 may include: CPU (Central Processing Unit) 101, RAM (Random Access Memory) 102, ROM (Read Only Memory) 103, System Bus 104, Hard Drive Controller 105, Keyboard Controller 106, Serial Interface Controller 107, Parallel Interface Controller 108, Display Controller 109, Hard Drive 110, Keyboard 111, Serial Peripheral Equipment 112, Parallel Peripheral Equipment 113 and Display 114. Among the above devices, CPU 101, RAM 102, ROM 103, Hard Drive Controller 105, Keyboard Controller 106, Serial Interface Controller 107, Parallel Interface Controller 108 and Display Controller 109 are coupled to the System Bus 104. Hard Drive 110 is coupled to Hard Drive Controller 105. Keyboard 111 is coupled to Keyboard Controller 106. Serial Peripheral Equipment 112 is coupled to Serial Interface Controller 107. Parallel Peripheral Equipment 113 is coupled to Parallel Interface Controller 108. Display 114 is coupled to Display Controller 109. It should be understood that the structure shown in FIG. 1 is only for exemplary purposes rather than any limitation to the present invention. In some cases, some devices may be added to or removed from the computing system 100 based on specific situations.

As will be appreciated by one of ordinary skill in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wire, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The invention relates to the processing of database data in a distributed database system. A computing node in the distributed database system may be implemented by the computer system 100 shown in FIG. 1. The processing of database data and the application of the database may also be performed on the computer system 100 shown in FIG. 1.

Those skilled in the art shall appreciate that the data of a distributed database are physically dispersed and stored at different nodes (or “sites”) of the distributed database system. The data on individual nodes communicatively connected via computer networks are under the unified management of the distributed database management system. Therefore, the distributed database is logically a unified whole, and applications may access geographically distributed databases via network connections.

The distributed database stores multiple data replicas on multiple nodes, so that each data item has at least one copy stored on other nodes. For example, if there are two copies of a data item D1, namely data item D1_R1 and data item D1_R2, then D1, D1_R1 and D1_R2 shall be located at different nodes. It should be noted that, in the context of the present invention, the term “replica” is a relative concept. For example, in the above example, D1 and D1_R2 are also replicas of D1_R1, and D1 and D1_R1 are also replicas of D1_R2.

In short, data in a distributed database is redundant, which may increase the degree of parallelism in data usage and also improve data availability in case of failure (for example, node failure or network failure).

The general idea of the present invention is to utilize the data redundancy of the distributed database to process database data, so as to improve the efficiency of querying database data without taking up additional computing resources.

Refer first to FIG. 2, which illustratively shows a distributed database system 200 according to an embodiment of the present invention. As an example, the distributed database system 200 shown in FIG. 2 has three data replicas 220_1, 220_2, 220_3 of database data stored on six computing nodes, wherein the data replica 220_1 is stored on computing nodes 230_1 and 230_2, the data replica 220_2 is stored on computing nodes 230_3 and 230_4, and the data replica 220_3 is stored on computing nodes 230_5 and 230_6.

The computing nodes (hereinafter also briefly referred to as “node/nodes”) may be implemented with the computer system 100 as shown in FIG. 1. Nodes may be communicatively coupled with each other via computer networks (not shown). The six nodes shown in FIG. 2 are merely exemplary. In practice, there may be hundreds of nodes, and the number of nodes for storage of each data replica may also exceed two.

As shown, the data replicas 220_1, 220_2 and 220_3 respectively have a corresponding index 210_1, 210_2 and 210_3. According to an embodiment of the present invention, the indices 210_1, 210_2 and 210_3 may be saved on other nodes, for example, on a master node (not shown) of the distributed database system.

It should be noted that only three data replicas (hereinafter also briefly referred to as “replica/replicas”) are shown in FIG. 2, whereas the number of data replicas in an actual distributed database may be fewer (for example, two) or more than three.

According to an embodiment of the present invention, replicas 220_1, 220_2 and 220_3 all contain the same data items, but they are sorted in different ways.

According to an embodiment of the present invention, the multiple data replicas respectively have indices (also referred to as “index data”) 210_1, 210_2 and 210_3 associated with their sorting manners. The indices may be saved on the master node (not shown) of the distributed database system.

Features of the data replicas and indices shown in FIG. 2 will be described in detail below with reference to FIGS. 4A-4C and FIG. 5, and, with reference to FIG. 6, a description will be provided of how to create the data replicas and indices shown in FIG. 2. Prior to that, the database data that is applicable to the distributed database system 200 of the invention will be introduced first.

Refer to FIG. 3, which illustratively shows two examples of database data. The database data indicated by reference mark 300A is an example of relational database data. The data 300A record values of temperature, wind speed and humidity sampled by four sensor devices 0001-0004 during the period from 2000.1.1 to 2011.12.31. “Device-id”, “Time”, “Temperature”, “Wind Speed” and “Humidity” in the first line are data attributes (briefly referred to as “attribute/attributes”). All other lines are data tuples. Each field of a data tuple is an attribute value. For example, in the tuple “0001,2011.12.31,8,4,7”, the field “0001” is a value of the attribute “Device-id”, which represents a sensor device identified by the Device-id of “0001”.

The database data indicated by reference mark 300B is an example of non-relational database (e.g. NoSQL database) data. Each row in the data 300B represents a data tuple, and each data tuple includes a “Row Key” field and several fields each consisting of a “<Key,Value>” pair. For example, “0001, <Device-id, 0001>, <Time, 2011.12.31>, <Temperature, 8>, <Wind Speed, 4>, <Humidity, 7>” is a tuple. The field “0001” in the tuple is a value of the row key. The field “<Device-id, 0001>” indicates that the value of the attribute “Device-id” is “0001”.

The data 300A and data 300B shown in FIG. 3 are exemplary only, and are used to represent the well-known relational database and non-relational database, respectively. The database data of the distributed database system shown in FIG. 2 may use either the structure of the data 300A or the structure of the data 300B. For convenience of description, only the data 300B will be taken as the example for illustrating various embodiments of the present invention below. For those skilled in the art who understand the relational database and the non-relational database, it would not be difficult to apply the various embodiments illustrated with the exemplary non-relational database data 300B to the relational database data depicted by the data 300A.
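The correspondence between the two structures can be illustrated with a minimal Python sketch that turns one relational tuple of the data 300A into the row-key-plus-pairs form of the data 300B. The function name `to_key_value` and the comma-separated input format are assumptions made for illustration only; they are not taken from the embodiments.

```python
from typing import List, Tuple

# Attribute names as listed in the first line of the data 300A.
ATTRIBUTES = ["Device-id", "Time", "Temperature", "Wind Speed", "Humidity"]


def to_key_value(line: str, row_key_attribute: str = "Device-id") -> Tuple[str, List[Tuple[str, str]]]:
    """Convert a comma-separated relational tuple into (row key, <Key,Value> pairs).

    Hypothetical helper: the row key defaults to the Device-id value,
    matching the example tuple of the data 300B.
    """
    values = line.split(",")
    pairs = list(zip(ATTRIBUTES, values))
    row_key = dict(pairs)[row_key_attribute]
    return row_key, pairs


row_key, pairs = to_key_value("0001,2011.12.31,8,4,7")
print(row_key, pairs)
```

The reverse direction works the same way as long as every tuple carries the full set of attributes, which is the case for the sensor data used in this description.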

Below, with reference to FIGS. 4A-4C, details of the data replicas 220_1, 220_2 and 220_3 will be described.

FIG. 4A illustratively shows a data replica 410 created according to an embodiment of the invention.

Here it is assumed that the data replica 410 is used as the data replica 220_1 shown in FIG. 2. The data replica 220_1 is stored on nodes 230_1 and 230_2, so the replica 410 is divided into data sets 411 and 412 which respectively represent two parts of the data replica 220_1 stored in the nodes 230_1 and 230_2.

The data replica 410 is said to be derived from the database data 300B shown in FIG. 3. In other words, it is a data replica created for the database data 300B.

It is to be noted first that, for simplicity, the data replica 410 shown in FIG. 4A uses a representation that looks different from but is substantially the same as that of the database data 300B.

For example, the representation of

Row Key (Device-id_Time)   Device-id   Time       Temperature   Wind Speed   Humidity
0001_2000.1.1              0001        2000.1.1   10            14

in the data replica 410 is equivalent to the following representation.

Row Key (Device-id_Time), {Key, Value}
0001_2000.1.1, <Device-id, 0001>, <Time, 2000.1.1>, <Temperature, 10>, <Wind Speed, 14>

As will be described with reference to FIG. 6, the data replica 410 may be created for the database data 300B in the following manner.

The database data 300B is sorted according to attributes “Device-id” and “Time”. The row key “Device-id_Time” is then generated based on the attributes “Device-id” and “Time”. The database data 300B thus sorted and having the generated row key “Device-id_Time” is used as the data replica 410.

In this example, multiple sorting is conducted on the database data 300B according to the two attributes “Device-id” and “Time”. It is first sorted by the attribute “Device-id” and then by the attribute “Time”. The concatenation of the attributes “Device-id” and “Time”, “Device-id_Time”, is used as the row key in place of the original row key “Device-id” of the database data 300B. For example, the first column of the data replica 410, “0001_2000.1.1”, is a value of the row key (Device-id_Time).

Because the data replica 410 is generated by multiple sorting on the data 300B, the data items or tuples in the data sets 411 and 412 are stored on the nodes 230_1 and 230_2 in accordance with the sorted order as sequentially as possible. For example, in the data set 411, tuples with the value of “Device-id” being “0001” will be stored continuously or adjacently in the memory of the node 230_1.
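The creation of a data replica by sorting and row-key concatenation can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiments: the names `create_replica`, `sort_value` and `parse_date` are hypothetical, dates are parsed into integer tuples so that they sort chronologically, and the row for device 0002 is made-up sample data.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]

# Attributes compared as numbers rather than as text (assumption).
NUMERIC = {"Temperature", "Wind Speed", "Humidity"}


def parse_date(text: str) -> Tuple[int, ...]:
    """Turn '2000.1.1' into (2000, 1, 1) so dates sort chronologically."""
    return tuple(int(part) for part in text.split("."))


def sort_value(attribute: str, value: str):
    """Map an attribute value to something that sorts in its natural order."""
    if attribute == "Time":
        return parse_date(value)
    if attribute in NUMERIC:
        return float(value)
    return value


def create_replica(rows: List[Row], sort_attributes: Sequence[str]) -> List[Tuple[str, Row]]:
    """Sort rows by the given attributes and attach the concatenated row key."""
    ordered = sorted(
        rows,
        key=lambda row: tuple(sort_value(a, row[a]) for a in sort_attributes),
    )
    return [("_".join(row[a] for a in sort_attributes), row) for row in ordered]


rows = [
    {"Device-id": "0002", "Time": "2000.1.1", "Temperature": "12"},   # made-up sample
    {"Device-id": "0001", "Time": "2011.12.31", "Temperature": "8"},
    {"Device-id": "0001", "Time": "2000.1.1", "Temperature": "10"},
]

replica_410 = create_replica(rows, ["Device-id", "Time"])
replica_420 = create_replica(rows, ["Time", "Device-id"])
replica_430 = create_replica(rows, ["Temperature"])
print([key for key, _ in replica_410])
```

Calling the same function with a different attribute order yields a replica that holds the same tuples in a different physical order, which is the property the later query examples rely on.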

FIG. 4B schematically depicts another data replica 420 created according to an embodiment of the present invention.

Here it is assumed that the data replica 420 is used as the data replica 220_2 shown in FIG. 2. The data replica 220_2 is stored on nodes 230_3 and 230_4, so the replica 420 is divided into data sets 421 and 422, respectively representing two parts of the data replica 220_2 stored in the nodes 230_3 and 230_4.

The data replica 420 is represented in the same format as the data replica 410, and it is another, different data replica created for the database data 300B. The method of creating the data replica 420 is similar to the creation of the data replica 410.

In this example, multiple sorting is carried out on the database data 300B according to attributes “Time” and “Device-id”. It is first sorted by the attribute “Time” and then by the attribute “Device-id”. The concatenation of the attributes “Time” and “Device-id”, “Time_Device-id”, is used as the row key in place of the original row key “Device-id” of the database data 300B. For example, the first column of the data replica 420, “2000.1.1_0001”, is a value of the row key (Time_Device-id).

FIG. 4C schematically depicts another data replica 430 created according to an embodiment of the present invention.

Here it is assumed that the data replica 430 is used as the data replica 220_3 shown in FIG. 2. So the replica 430 is depicted as data sets 431 and 432 which respectively represent two parts of the data replica 220_3 stored in the nodes 230_5 and 230_6.

In this example, in creating the data replica 430, sorting is conducted on the database data 300B only by the single attribute “Temperature”, and that attribute is used as the row key.

The three data replicas 410, 420 and 430 of the database data 300B and the manner in which they are created have been described above with reference to FIGS. 4A-4C. The three data replicas are examples of the data replicas 220_1, 220_2 and 220_3 shown in FIG. 2. As indicated in the previous description of the distributed database system 200 according to an embodiment of the invention, the data replicas 220_1, 220_2 and 220_3 each have a corresponding index 210_1, 210_2 and 210_3. Referring to FIG. 5, the structure of such an index will be described below by way of example.

FIG. 5 schematically depicts an index of a data replica created according to an embodiment of the present invention. What is indicated by reference mark 510 in FIG. 5 is just a part of the index created on the data replica 410 according to the row key “Device-id_Time”. As shown, the index 510 is a three-layer B+ tree structured index. Reference mark 511 indicates the root node in the first layer, reference mark 512 indicates a plurality of intermediate nodes in the second layer and reference mark 513 indicates a plurality of leaf nodes in the third layer. As shown, each intermediate node of the index 510 represents a range of data of the data replica 410. For example, “0001_2000.1.1-0001_2011.12.31” represents all the data of which the value of the row key falls into the range 0001_2000.1.1 to 0001_2011.12.31. Each leaf node of the index 510 represents a data block on the disk that can be quickly located and read out. For example, the leaf node 521 represents a data block on the disk in the computing node 230_1 that stores the data replica 410. With such B+ tree structured indices, desired data blocks in the storage may be quickly located in response to requests for data query, insertion, update, and deletion.
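The way a range-based index leads from a row key to a data block can be sketched in Python with a flat, sorted list of block boundaries searched by binary search. This is a deliberately simplified stand-in for a single level of the B+ tree of FIG. 5, not a B+ tree implementation. The names `build_block_index` and `find_block`, the block size, and the representation of row keys as (Device-id, date tuple) pairs instead of concatenated strings are all assumptions made for illustration.

```python
import bisect
from typing import List, Tuple

RowKey = Tuple[str, Tuple[int, int, int]]  # (Device-id, (year, month, day))


def build_block_index(sorted_keys: List[RowKey], block_size: int):
    """Split sorted row keys into blocks and record the first key of each block."""
    blocks = [sorted_keys[i:i + block_size] for i in range(0, len(sorted_keys), block_size)]
    first_keys = [block[0] for block in blocks]
    return first_keys, blocks


def find_block(first_keys: List[RowKey], key: RowKey) -> int:
    """Return the position of the block whose key range may contain `key`."""
    position = bisect.bisect_right(first_keys, key) - 1
    return max(position, 0)


# Row keys of a replica sorted by Device-id and then Time (sample values).
keys = [
    ("0001", (2000, 1, 1)),
    ("0001", (2005, 6, 1)),
    ("0001", (2011, 12, 31)),
    ("0002", (2000, 1, 1)),
    ("0002", (2011, 12, 31)),
]

first_keys, blocks = build_block_index(keys, block_size=2)
block_no = find_block(first_keys, ("0001", (2011, 12, 31)))
print(block_no, blocks[block_no])
```

Because the replica is stored in sorted order, a range condition on the row key maps to a run of adjacent blocks, so only the first and last block positions need to be looked up.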

It is to be noted that those skilled in the art shall appreciate that the B+ tree index structure as shown in FIG. 5 is only an example of, rather than a limitation to, the index according to embodiments of the invention. In implementing embodiments of the invention, other similar index structures may be employed.

Components of the distributed database system 200 and their implementations according to an embodiment of the present invention have been described above.

Refer now to FIG. 6, which schematically depicts a flowchart of a method according to an embodiment of the present invention.

FIG. 6 shows a method for processing database data in a distributed database system. Here, the distributed database system comprises multiple computing nodes that are communicatively coupled via computer networks.

The processing of database data according to an embodiment of the invention mainly pertains to a configuration phase and a query phase. As shown, in the instant embodiment, the processing of distributed database data in the configuration phase comprises Steps 610 to 630.

It is assumed that initially a database system administrator has determined the number of data replicas and the scheme for configuring the data replicas on computing nodes based on available computing resources and application requirements. For example, it is determined that three data replicas are needed, and the three data replicas will be allocated to nodes 230_1 and 230_2, nodes 230_3 and 230_4, and nodes 230_5 and 230_6.

In Step 610, data replicas are created for the database data, wherein a plurality of different data replicas are created for the database data, each of the data replicas being created in the following way:

sorting the database data according to at least one data attribute;

generating a row key based on the at least one data attribute; and

using the sorted database data with the row key as the data replica.

The manner of creating a data replica for database data has been illustrated by way of example in previous paragraphs with reference to FIGS. 4A-4C. Thus, it is not described again here in detail.

According to an embodiment of the invention, multiple sorting may be carried out on the database data according to multiple data attributes. In that case, said generating a row key based on the at least one data attribute comprises using the concatenation of the multiple data attributes as the row key.

For example, FIG. 3 shows that the row key of the database data 300B is “Device-id”. By carrying out multiple sorting on the data 300B according to the two data attributes “Device-id” and “Time” and using the concatenation of the two attributes, “Device-id_Time”, as the new row key to take the place of the original row key “Device-id”, the data replica 410 is generated as a result.

In this case, the multiple sorting is carried out according to the two attributes “Device-id” and “Time”. However, the invention is not limited to that. In fact, multiple sorting may be conducted according to three or more attributes in the same way.

Of course, sorting may also be conducted on the database data according to a single data attribute; in this case, the row key is that data attribute.

According to an embodiment of the present invention, multiple sorting may be conducted preferentially according to frequently queried data attribute(s) so as to create a data replica. Taking the data 300B shown in FIG. 3 as an example, according to the historical records of applications, “Device-id”, “Time” and “Temperature” are attributes that are frequently queried. Thus the attributes “Device-id”, “Time” and “Temperature” may be selected as the basis on which multiple sorting is carried out on the data 300B for generating a data replica.
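One way to identify frequently queried attributes from historical records is to count how often each attribute appears in past query conditions. The Python sketch below illustrates this under assumptions: the description does not specify the format of the historical records, so the query log here is a hypothetical list of attribute lists, and `rank_attributes` is an illustrative name.

```python
from collections import Counter
from typing import List


def rank_attributes(query_log: List[List[str]]) -> List[str]:
    """Return attributes ordered from most to least frequently queried."""
    counts = Counter(attribute for attributes in query_log for attribute in attributes)
    return [attribute for attribute, _ in counts.most_common()]


# Hypothetical history: each entry lists the attributes used in one query condition.
query_log = [
    ["Device-id", "Time"],
    ["Device-id", "Temperature"],
    ["Device-id", "Time"],
    ["Humidity"],
]

ranking = rank_attributes(query_log)
print(ranking[:2])
```

The leading entries of the ranking are candidates for the sort attributes of a replica; how many attributes to use per replica remains a configuration decision.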

After Step 610, the process proceeds to Step 620. In Step 620, different data replicas are stored into different computing nodes.

For example, as shown in FIG. 2, the data replica 410 is stored into the computing nodes 230_1 and 230_2. In addition, the data replica 420 is stored into the computing nodes 230_3 and 230_4, and the data replica 430 is stored into the computing nodes 230_5 and 230_6.
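The division of one sorted replica across its computing nodes can be sketched in Python as follows. The description states that a replica such as 410 is divided into two data sets (411 and 412) but does not prescribe how the split is made; splitting into contiguous ranges of roughly equal size is an assumption, and `split_across_nodes` is a hypothetical name.

```python
from typing import Dict, List


def split_across_nodes(replica_keys: List[str], node_names: List[str]) -> Dict[str, List[str]]:
    """Assign contiguous runs of a sorted replica to nodes, preserving sort order."""
    node_count = len(node_names)
    chunk = -(-len(replica_keys) // node_count)  # ceiling division
    return {
        node: replica_keys[i * chunk:(i + 1) * chunk]
        for i, node in enumerate(node_names)
    }


# Row keys of a replica sorted by Device-id and then Time (sample values).
replica_410_keys = [
    "0001_2000.1.1",
    "0001_2011.12.31",
    "0002_2000.1.1",
    "0002_2011.12.31",
    "0003_2000.1.1",
]

placement = split_across_nodes(replica_410_keys, ["230_1", "230_2"])
print(placement)
```

Keeping each node's share contiguous means tuples that are adjacent in sort order stay on the same node wherever possible, which suits range queries on the row key.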

After Step 620, the process proceeds to Step 630.

In Step 630, an index is created by row key for each data replica. For example, the index 510 shown in FIG. 5 is created for the data replica 410 according to the row key “Device-id_Time” of the data replica 410. In addition, an index is created according to the row key “Time_Device-id” for the data replica 420, and an index is created according to the row key “Temperature” for the data replica 430. Taking the scenario shown in FIG. 2 as an example, as a result of the execution of Step 630, indices 210_1, 210_2 and 210_3 are created for the data replicas 220_1, 220_2 and 220_3, respectively.

Each of the created indices contains the mapping relationship between data items in the corresponding data replica and their positions in the computing node. Because data in different data replicas are sorted in different ways, the corresponding index of each data replica is also different. According to an embodiment of the invention, each of the created indices will be stored, for example, in the master node of the distributed database system 200.

In accordance with an embodiment of the invention, after an index is created for a data replica, the association relationship between the index and the computing node storing its associated data replica is also stored. For example, the index 510 in FIG. 5 is associated with the computing nodes 230_1 and 230_2 that store the data replica 410. Through the stored association relationship, it is convenient to identify on which computing node(s) the data replica corresponding to an index resides.

After Step 630, the configuration phase is over, and the distributed database system 200 enters into a ready state. As shown in FIG. 2, at this time, the data replica 220_1 stored on the computing nodes 230_1 and 230_2 is the data replica 410 created in Step 610. The data replica 220_2 stored on the computing nodes 230_3 and 230_4 is the data replica 420 created in Step 610. The data replica 220_3 stored on the computing nodes 230_5 and 230_6 is the data replica 430 created in Step 610. The indices 210_1, 210_2 and 210_3 created in Step 630 are also stored.

In the ready state, queries from an application may be accepted. According to an embodiment of the invention, in response to receiving a data query request as indicated by the decision block 632, the distributed database system 200 will carry out a query on a data replica that matches the data query request (650), and return the query result to the application initiating the data query request (660).

Because the system has stored data replicas sorted in different ways on different nodes, it may, depending on the query conditions in the query request, carry out the query on the data replica that matches the data query request so as to improve querying speed.

For example, the query of a query request is:

Select Temperature where Device-id=0001 AND 2009.1.1>Time>2006.1.1  (Query-1)

The data replica 410 matches with the query condition

“Device-id=0001 AND 2009.1.1>Time>2006.1.1”

in the Query-1. Therefore, the query is carried out on the data replica 410.

The query of a query request is:

Select Temperature where Device-id=* AND Time=2011.12.31   (Query-2)

The data replica 420 matches with the query condition

“Device-id=* AND Time=2011.12.31”

in the Query-2. Therefore, the query is carried out on the data replica 420.

The query of a query request is:

Select Humidity where Temperature>10   (Query-3)

The data replica 430 matches with the query condition

“Temperature>10”

in the Query-3. Therefore, the query is carried out on the data replica 430.

As shown, according to an embodiment of the invention, the step of carrying out a query on a data replica that matches with the data query request comprises the following Steps 641 and 642.

In Step 641, a matching degree between the row key of each data replica's index and the query condition of the data query request is calculated.

The matching degree between the row key of a data replica's index and a query condition is defined as follows:

Matching degree = <the number of attributes commonly contained in the query condition and in the row key> / <the number of attributes contained in the row key>

wherein, if the query condition contains a range of values of a particular attribute A and the row key also contains the attribute A, then the attribute A is counted as commonly contained only if it is located at the end of the row key.

For the Query-3, the query condition “Temperature>10” contains an attribute “Temperature”. The matching degree between the row key (Device-id_Time) of the index of data replica 410 and the query condition is 0. The matching degree between the row key (Time_Device-id) of the index of data replica 420 and the query condition is 0. The matching degree between the row key (Temperature) of the index of the data replica 430 and the query condition is 1.

For the Query-1, the query condition is “Device-id=0001 AND 2009.1.1>Time>2006.1.1”. The matching degree between the row key of the index of data replica 410 and the query condition is 1. The matching degree between the row key of the index of data replica 420 and the query condition is 0.5. The matching degree between the row key of the index of data replica 430 and the query condition is 0.

For the Query-2, the query condition is “Device-id=* AND Time=2011.12.31”. The matching degree between the row key of the index of data replica 410 and the query condition is 0.5. The matching degree between the row key of the index of data replica 420 and the query condition is 1. The matching degree between the row key of the index of data replica 430 and the query condition is 0.
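The matching degree calculation of Step 641 may be sketched as follows. The representation of a query condition as a mapping from attribute name to either "eq" or "range" is an assumption of this sketch, as is the treatment of the wildcard “Device-id=*” as a range; the latter is inferred from the matching degrees stated for the Query-2 rather than stated explicitly. The function name `matching_degree` is hypothetical.

```python
def matching_degree(row_key_attrs, condition):
    """Fraction of row-key attributes that the query condition can make use of.

    condition maps an attribute name to "eq" (single value) or "range".
    A range attribute is counted only if it is the last attribute of the row key.
    """
    matched = 0
    for pos, attr in enumerate(row_key_attrs):
        kind = condition.get(attr)
        is_last = pos == len(row_key_attrs) - 1
        if kind == "eq" or (kind == "range" and is_last):
            matched += 1
    return matched / len(row_key_attrs)


# Row keys of the indices of the data replicas 410, 420 and 430.
ROW_KEYS = {
    "410": ["Device-id", "Time"],
    "420": ["Time", "Device-id"],
    "430": ["Temperature"],
}

# Query conditions; the wildcard in Query-2 is modelled as a range (assumption).
QUERIES = {
    "Query-1": {"Device-id": "eq", "Time": "range"},
    "Query-2": {"Device-id": "range", "Time": "eq"},
    "Query-3": {"Temperature": "range"},
}

degrees = {
    name: {rid: matching_degree(attrs, cond) for rid, attrs in ROW_KEYS.items()}
    for name, cond in QUERIES.items()
}
```

Under these assumptions, `degrees` holds 1, 0.5 and 0 for the Query-1, 0.5, 1 and 0 for the Query-2, and 0, 0 and 1 for the Query-3 against the data replicas 410, 420 and 430, respectively.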

In Step 642, the data replica matching with the data query request is determined according to the result calculated.

According to an embodiment of the invention, the data replica whose index has the row key with the largest matching degree with the data query request is determined to be the data replica matching with the data query request.

For example, according to the result of Step 641, a query for the Query-1 is carried out on the data replica 410. In the distributed database system 200 shown in FIG. 2, that is equivalent to carrying out the query on the data replica 220_1 and, in this case, the query condition “Device-id=0001 AND 2009.1.1>Time>2006.1.1” will be converted to the query condition “0001_2009.1.1>row key AND row key>0001_2006.1.1”. Referring to FIG. 5, the query condition “0001_2009.1.1>row key AND row key>0001_2006.1.1” is matched with the leaf node 521 of the index 510, so the position where the corresponding data block 531 is stored may be located rapidly. Without the index shown in FIG. 5, the querying process might take longer. Therefore, this example illustrates that using the data replica and index created according to embodiments of the invention may improve the efficiency of querying. The effect would be particularly evident for multi-dimensional queries and multi-dimensional range queries.
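The conversion of a query condition into a row-key range may be sketched as follows. This is a simplified illustration, not the index structure of FIG. 5: the index is modelled as a sorted list of row keys searched by bisection, the sample row keys are invented, and the times are written as zero-padded dates (an assumption) so that string comparison agrees with chronological order. The function name `row_key_range` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def row_key_range(sorted_keys, low, high):
    """Return the positions of row keys k with low < k < high (both bounds exclusive)."""
    start = bisect_right(sorted_keys, low)   # first key strictly greater than low
    end = bisect_left(sorted_keys, high)     # first key greater than or equal to high
    return range(start, end)


# Invented row keys of the form Device-id_Time, already in sorted order.
sorted_keys = [
    "0001_2005-06-01",
    "0001_2007-03-15",
    "0001_2008-11-30",
    "0001_2010-01-01",
    "0002_2007-01-01",
]

# Query-1: Device-id=0001 AND 2009.1.1 > Time > 2006.1.1, rewritten on the row key.
device, t_low, t_high = "0001", "2006-01-01", "2009-01-01"
hits = row_key_range(sorted_keys, f"{device}_{t_low}", f"{device}_{t_high}")
matched = [sorted_keys[i] for i in hits]
```

Because the replica is sorted by the same row key, the matching records occupy one contiguous run of positions, which is why the range can be located with two searches instead of a full scan.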

In a similar manner, a query for the Query-2 is carried out on the data replica 420, and a query for the Query-3 is carried out on the data replica 430.

According to another embodiment of the present invention, for a certain query, if the row keys of the indices of multiple data replicas share the largest matching degree with the query condition, then the query may be carried out on any one of the data replicas corresponding to those row keys.
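The determination of Step 642, including the tie case, may be sketched as follows. The matching degrees for the Query-1 are taken from the description above; the tie example and the function name `select_replica` are hypothetical, and choosing the first candidate on a tie is only one of the permitted choices.

```python
def select_replica(degrees):
    """Return (chosen replica, all replicas sharing the largest matching degree)."""
    best = max(degrees.values())
    candidates = [rid for rid, degree in degrees.items() if degree == best]
    # Any candidate may be used; this sketch simply takes the first one.
    return candidates[0], candidates


# Matching degrees of the Query-1 against the data replicas 410, 420 and 430.
query_1_degrees = {"410": 1.0, "420": 0.5, "430": 0.0}
chosen, tied = select_replica(query_1_degrees)

# Hypothetical tie: two replicas match the query condition equally well.
tie_chosen, tie_candidates = select_replica({"A": 1.0, "B": 1.0, "C": 0.5})
```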

In the ready state, data updating may be carried out on the distributed database. According to an embodiment of the present invention, in response to receiving a request for data updating, each data replica and the index of each data replica are updated. The updating of data replica and index may be carried out in accordance with methods already known in the prior art for data updating on distributed databases, so there is no need to repeat detailed descriptions here.
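Purely as an illustration of what updating each data replica and its index entails, the following sketch inserts one new record into every replica at the position dictated by that replica's row key. The in-memory layout (a sorted list of row keys standing in for the index, kept parallel to the list of records), the zero-padded date format and the name `apply_insert` are assumptions of this sketch; an actual system would rely on the known update methods mentioned above.

```python
from bisect import bisect_right


def apply_insert(replicas, record):
    """Insert record into every replica, keeping each replica sorted by its own row key."""
    for replica in replicas.values():
        row_key = "_".join(str(record[a]) for a in replica["key_attrs"])
        pos = bisect_right(replica["keys"], row_key)
        replica["keys"].insert(pos, row_key)   # stand-in for the index update
        replica["rows"].insert(pos, record)    # the replica data itself


r1 = {"Device-id": "0001", "Time": "2007-01-01", "Temperature": 18}
r2 = {"Device-id": "0002", "Time": "2008-01-01", "Temperature": 12}

replicas = {
    "410": {"key_attrs": ["Device-id", "Time"],
            "keys": ["0001_2007-01-01", "0002_2008-01-01"], "rows": [r1, r2]},
    "420": {"key_attrs": ["Time", "Device-id"],
            "keys": ["2007-01-01_0001", "2008-01-01_0002"], "rows": [r1, r2]},
}

new_record = {"Device-id": "0001", "Time": "2009-01-01", "Temperature": 20}
apply_insert(replicas, new_record)
```

The same record lands at different positions in the two replicas, which reflects that the replicas are sorted in different ways and that each index must be updated separately.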

Various embodiments of the method for processing database data in a distributed database system have been described above. Based on the same inventive concept, the invention also provides an apparatus for processing database data in a distributed database system.

FIG. 7 schematically depicts an apparatus 700 for processing database data in a distributed database system according to an embodiment of the present invention.

The distributed database system 200 to which embodiments of the invention are applied comprises a plurality of computing nodes communicatively coupled via computer networks. As shown in FIG. 7, the apparatus 700 comprises a data replica creation module 710, a replica storage module 720 and an index creation module 730. The functionality and various embodiments of the modules are briefly described below.

The data replica creation module 710 is configured to create a plurality of different data replicas wherein each of the data replicas is created in the following way:

sorting the database data according to at least one data attribute;

generating a row key based on the at least one data attribute; and

using the sorted database data with the row key as the data replica.

The replica storage module 720 is configured to store different data replicas in different computing nodes.

The index creation module 730 is configured to create an index for each of the data replicas according to its row key.

According to an embodiment of the invention, the data replica creation module 710 is configured to carry out multiple sorting on the database data according to multiple data attributes and use the concatenation of the multiple data attributes as the row key.

According to an embodiment of the present invention, the apparatus 700 further comprises a module (not shown) which is configured to store the index.

According to an embodiment of the present invention, the apparatus 700 further comprises a module (not shown) which is configured to store the association relationship between the index and the computing node storing its associated data replica.

According to an embodiment of the present invention, the apparatus 700 further comprises a query module 750, which is configured to carry out a query on a data replica that matches with a data query request in response to receiving the data query request.

According to an embodiment of the present invention, the apparatus 700 further comprises a matching module 740, which is configured to calculate a matching degree between the row key of each data replica's index and the query condition of the data query request, and determine the data replica matching with the data query request according to the result calculated.

According to an embodiment of the present invention, the matching module is configured to determine the data replica whose index has the row key with the largest matching degree with the data query request to be the data replica matching with the data query request.

According to an embodiment of the present invention, the apparatus 700 further comprises a data updating module (not shown), which is configured to update each data replica and the index of each data replica in response to receiving a request for data updating.

Embodiments of the apparatus for processing database data in a distributed database system are described above. Since embodiments of the method for processing database data in a distributed database system have been described in the previous paragraphs, content that would duplicate the description of the method is omitted from the description of the apparatus.

Embodiments of the present invention utilize redundancy of a distributed database system to deploy different data replicas at multiple computing nodes, which is helpful in improving query performance, especially the efficiency of multi-dimensional data queries.

Embodiments of the invention have been described. The above description is only exemplary, rather than exhaustive or limited to the embodiments disclosed. Those skilled in the art shall appreciate that various modifications and changes thereto may be readily made. The choice of terms herein is intended to best explain the principles of the embodiments, their practical application or their improvement over techniques in the market, or to allow those skilled in the art to understand the various embodiments disclosed herein.

1. A method for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the method comprising: creating a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica, storing different data replicas in different computing nodes; and creating an index for each of the data replicas according to its row key.
 2. The method of claim 1, wherein said sorting the database data according to at least one data attribute comprises conducting multiple sorting on the database data according to multiple data attributes, and said generating a row key based on the at least one data attribute comprises using the concatenation of the multiple data attributes as the row key.
 3. The method of claim 1, further comprising: storing the index and the association relationship between the index and the computing node storing its associated data replica.
 4. The method of claim 3, further comprising: carrying out a query on a data replica that matches with a data query request in response to receiving the data query request.
 5. The method of claim 4, further comprising: calculating a matching degree between the row key of each data replica's index and the query condition of the data query request; and determining the data replica matching with the data query request according to the result calculated.
 6. The method of claim 5, wherein a data replica corresponded to the row key of a data replica's index having the largest matching degree with the data query request is determined to be the data replica matching with the data query request.
 7. The method of claim 1, further comprising: in response to receiving a request for data updating, updating each data replica and the index of each data replica.
 8. The method of claim 1, wherein the database data is non-relational database data.
 9. An apparatus for processing database data in a distributed database system, wherein the distributed database system comprises a plurality of computing nodes communicatively coupled via computer networks, the apparatus comprising: a data replica creation module, configured to create a plurality of different data replicas wherein each of the data replicas is created in the following way: sorting the database data according to at least one data attribute; generating a row key based on the at least one data attribute; and using the sorted database data with the row key as the data replica, a replica storage module, configured to store different data replicas in different computing nodes; and an index creation module, configured to create an index for each of the data replicas according to its row key.
 10. The apparatus of claim 9, wherein the data replica creation module is configured to carry out multiple sorting on the database data according to multiple data attributes and use the concatenation of the multiple data attributes as the row key.
 11. The apparatus of claim 9, further comprising: a module configured to store the index and the association relationship between the index and the computing node storing its associated data replica.
 12. The apparatus of claim 11, further comprising: a query module configured to carry out a query on a data replica that matches with a data query request in response to receiving the data query request.
 13. The apparatus of claim 12, further comprising a matching module configured to calculate a matching degree between the row key of each data replica's index and the query condition of the data query request; and determine the data replica matching with the data query request according to the result calculated.
 14. The apparatus of claim 13, wherein the matching module is configured to determine a data replica corresponded to the row key of a data replica's index having the largest matching degree with the data query request to be the data replica matching with the data query request.
 15. The apparatus of claim 9, further comprising: a data updating module configured to update each data replica and the index of each data replica in response to receiving a request for data updating.
 16. The apparatus of claim 9, wherein the database data is non-relational database data.